Abstract: We consider an asexual population under strong selection-weak
mutation conditions evolving on rugged fitness landscapes with many local
fitness peaks. Unlike previous studies in which the initial fitness of the
population is assumed to be high, here we start the adaptation process with
a low fitness corresponding to a population in a stressful novel environment.
For generic fitness distributions, using an analytic argument we find that
the average number of steps to a local optimum varies logarithmically with
the genotype sequence length and increases as the correlations among
genotypic fitnesses increase. When the fitnesses are exponentially or
uniformly distributed, using an evolution equation for the distribution of
population fitness, we analytically calculate the fitness distribution of
fixed beneficial mutations and the walk length distribution.

Adaptation is an evolutionary process during which a population improves
its fitness by accumulating beneficial mutations. A population of genotypic
sequences produces a suite of mutants and if better mutants become
available, a maladapted population may acquire one of the beneficial
mutations provided it does not get lost due to genetic drift. The fitter
population in turn may acquire another advantageous mutation and the process
goes on until the supply of beneficial mutations is exhausted. A number of
models with variable degrees of biological consistency have been proposed
and investigated to understand the process of adaptation (Miller et al.
2011). One of the simplest mathematical models was introduced by Gillespie,
in which beneficial mutations arise sequentially and fix rapidly (Gillespie
1991). If the mutation rate is small and the selection coefficient is large
(compared to the inverse population size), it is a good approximation to
assume that only the one-step mutants are accessible at any time and the
population is localised at a single genotype. Such a monomorphic population
performs an adaptive walk by moving uphill on a fitness landscape until no
more beneficial mutations can be found.

In the last few years, much of the work on Gillespie's model has focused
on the first step in the adaptation process. If the fitnesses of the wild
type and its one-mutant neighbors are rank ordered with the fittest sequence
at the top, the well-established theory of extremes of independent random
variables (David and Nagaraja 2003) can be exploited to obtain useful
information provided the wild type has a high fitness (rank). For a
moderately high ranked initial fitness, Orr calculated the expected rank at
the first step assuming exponential-like fitness distributions (Orr 2002).
His prediction has been tested in an experiment using single-stranded DNA
and found to be roughly consistent with the experimental data (Rokyta et al.
2005). This result was later generalised to other fitness distributions
(Joyce et al. 2008) and extended to include correlations among fitnesses
(Orr 2006). However, as the properties of the entire walk are required to
design a drug or a biomolecule (Bull and Otto 2005) and as experimental data
on multiple adaptive substitutions are becoming available (Rokyta et al.
2009; Schoustra et al. 2009), it is important to extend the existing theory
to address the statistical properties of the entire walk.

With this aim, we study Gillespie's mutational landscape model on rugged
fitness landscapes with many local fitness optima. An important difference
between our work and the previous ones is that here we start the adaptive
walk with low fitness to describe the adaptation process in novel
environments such as when antibiotics are introduced (MacLean and Buckling
2009; McDonald et al. 2010) whereas the initial fitness is assumed to be
high in other studies (Gillespie 1991; Orr 2002, 2006; Joyce et al. 2008).
Several numerical (Gillespie 1991; Orr 2006) and experimental studies
(Rokyta et al. 2009; Schoustra et al. 2009) have indicated that only a few
steps are required to reach a local optimum. In a simple adaptation model
that assumes the mutational neighborhood to remain unchanged during the
entire adaptive walk (Gillespie 1983), the average number of steps to a
local fitness peak has been calculated analytically for various fitness
distributions and shown to increase logarithmically with the rank of the
initial sequence (Neidhart and Krug 2011). However, here we work with a more
realistic mutation scheme in which a new suite of mutants is created in each
adaptive step. For generic fitness distributions, we argue that the average
number of adaptive steps increases logarithmically with sequence length with
a prefactor that depends on the choice of fitness distribution. Although our
argument does not capture the proportionality constant correctly, the
logarithmic dependence is seen to be in excellent agreement with the
simulation results. We also present detailed results on the statistical
properties of the entire walk for exponentially and uniformly distributed
fitnesses as these two distributions lend themselves to an analytic
treatment and are also consistent with the experiments (Eyre-Walker and
Keightley 2007; Rokyta et al. 2008). Following the approach of Flyvbjerg and
Lautrup (1992), we write a recursion relation for the fitness distribution
of fixed beneficial mutations at an adaptive step which is valid for long
sequences and fitness distributions with a finite mean. A similar
distribution has been calculated in the clonal interference regime in which
multiple mutants are produced per generation (Rozen et al. 2002) while here
we work in the weak mutation regime. For the above mentioned distributions,
we also find the distribution of walk length. The average walk length
calculated using this approach gives a prefactor consistent with the
numerical results.

Although for most of the article we work with uncorrelated fitnesses
and assume that the distribution of the fitness does not change during the
course of evolution, the effect of correlations is also discussed. As
experiments support an intermediate degree of correlations in fitness
landscapes (Carneiro and Hartl 2010; Miller et al. 2011) and changing
fitness distributions may be modeled by correlated fitnesses (Orr 2006), we
calculate the average number of steps to an optimum on a fitness landscape
generated by the block model of correlated fitnesses in which a sequence is
divided into several independent blocks and correlations arise when two
sequences share some blocks (Perelson and Macken 1995). The average walk
length has been measured using numerical simulations in a block model in
Orr (2006) and it was speculated that the average number of adaptive steps
is independent of the underlying fitness distribution and increases
linearly with the number of blocks. We show that while the latter result is
roughly correct, the average number of steps to a local optimum is not
independent of the fitness distribution, which is a consequence of the
result discussed above for the uncorrelated fitness landscapes.

MODELS AND METHODS 

Uncorrelated and correlated fitness landscapes: An uncorrelated
fitness landscape can be generated by assigning a fitness to a sequence
independent of that of other sequences. The fitnesses are sampled from a
common distribution $p(f)$ with support on the interval $[l, u]$. Although
the full distribution of absolute fitness is unknown, one can obtain an
insight into its nature through the distribution of beneficial mutations
which has been measured in several theoretical and experimental studies
(Eyre-Walker and Keightley 2007). A theoretical argument suggests that since
good mutations are rare, their distribution is governed by the upper tail of
the fitness distribution $p(f)$ (Gillespie 1991). It is known from the
extreme value theory (EVT) for independent and identically distributed
(i.i.d.) random variables that the asymptotic distribution of the extreme
value can be one of the following three types (David and Nagaraja 2003):
Frechet for algebraically decaying underlying distributions, Gumbel for
unbounded distributions decaying faster than a power law and Weibull for
bounded distributions. In order to be consistent with this result, we make
the following choices for the fitness distributions:



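$$p(f) = (\delta - 1)(1 + f)^{-\delta}, \quad \delta > 2 \quad \text{(Frechet)} \qquad (1)$$

$$p(f) = \gamma f^{\gamma - 1} e^{-f^{\gamma}}, \quad \gamma > 0 \quad \text{(Gumbel)} \qquad (2)$$

$$p(f) = \nu (1 - f)^{\nu - 1}, \quad \nu > 0,\ f < 1 \quad \text{(Weibull)} \qquad (3)$$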



The condition $\delta > 2$ in (1) is imposed to keep the transition rate
finite (as explained later). The last two fitness functions (2) and (3) are
of particular interest as several experimental results on the distribution
of beneficial mutations have been found to lie in the Gumbel domain (Imhof
and Schlotterer 2001; Sanjuan et al. 2004; Rokyta et al. 2005; Kassen and
Bataillon 2006; MacLean and Buckling 2009) and a recent work finds a best
fit for the distribution of beneficial effects to a uniform distribution
which lies in the Weibull domain (Rokyta et al. 2008).
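As a rough illustration of the three EVT domains, the sketch below draws
fitnesses by inverting the cumulative distribution. The explicit densities
written in the comments are an assumption about the forms of (1)-(3), and all
function names are hypothetical.

```python
import math
import random

# Assumed densities (an assumption of this sketch, f >= 0 throughout):
#   Frechet domain: p(f) = (delta - 1) * (1 + f)**(-delta), delta > 2
#   Gumbel domain:  p(f) = gamma * f**(gamma - 1) * exp(-f**gamma), gamma > 0
#   Weibull domain: p(f) = nu * (1 - f)**(nu - 1), nu > 0, f < 1


def cdf_frechet(f, delta):
    return 1.0 - (1.0 + f) ** (1.0 - delta)


def quantile_frechet(u, delta):
    return (1.0 - u) ** (-1.0 / (delta - 1.0)) - 1.0


def cdf_gumbel(f, gamma):
    return 1.0 - math.exp(-(f ** gamma))


def quantile_gumbel(u, gamma):
    return (-math.log(1.0 - u)) ** (1.0 / gamma)


def cdf_weibull(f, nu):
    return 1.0 - (1.0 - f) ** nu


def quantile_weibull(u, nu):
    return 1.0 - (1.0 - u) ** (1.0 / nu)


def sample(quantile, param, n, seed=0):
    """Draw n fitnesses by feeding uniform numbers in [0, 1) to a quantile function."""
    rng = random.Random(seed)
    return [quantile(rng.random(), param) for _ in range(n)]
```

With `gamma = 1` the Gumbel-domain sampler reduces to exponentially
distributed fitnesses and with `nu = 1` the Weibull-domain sampler reduces to
uniformly distributed fitnesses, the two cases treated in detail in this
article.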

We also study adaptive walks on correlated fitness landscapes which are
generated using a block model (Perelson and Macken 1995) in which a sequence
of length $L$ is divided into $B$ blocks of equal size $L_B = L/B$. The
block fitness is an i.i.d. random variable chosen from the distribution
$p(f)$ and the sequence fitness is obtained on averaging over the fitnesses
of the blocks in the sequence. If two sequences share one or more blocks,
their fitnesses are correlated. The correlations can be tuned by changing
the number of blocks: if the number of blocks $B = 1$, sequence fitnesses
are completely uncorrelated while $B = L$ gives strongly correlated
fitnesses. It should be noted that the extreme value distribution of
correlated fitnesses may change from the corresponding i.i.d. class even if
correlations are weak (Jain et al. 2009; Jain 2011). In the following
discussion, we assume that the sequence fitnesses are uncorrelated and deal
with the correlated fitnesses in the last subsection of this section.
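A minimal sketch of the block model described above is given next. The class
name, the lazy caching of block fitnesses and the choice of keying each block
fitness by block position and block content are assumptions made for
illustration; the default block fitness distribution is taken to be
exponential.

```python
import random


class BlockLandscape:
    """Block model sketch: sequence fitness = average of i.i.d. block fitnesses."""

    def __init__(self, L, B, draw=None, seed=0):
        if L % B != 0:
            raise ValueError("L must be divisible by B")
        self.L = L
        self.B = B
        self.block_len = L // B  # L_B = L / B
        self._rng = random.Random(seed)
        # draw(rng) returns one block fitness; exponential p(f) by default.
        self._draw = draw if draw is not None else (lambda rng: rng.expovariate(1.0))
        self._cache = {}  # (block index, block content) -> block fitness

    def block_fitness(self, b, block):
        """Fitness of block number b with the given content, drawn once and cached."""
        key = (b, tuple(block))
        if key not in self._cache:
            self._cache[key] = self._draw(self._rng)
        return self._cache[key]

    def fitness(self, seq):
        """Sequence fitness obtained by averaging over the B block fitnesses."""
        n = self.block_len
        total = 0.0
        for b in range(self.B):
            total += self.block_fitness(b, seq[b * n:(b + 1) * n])
        return total / self.B
```

Two sequences that differ only inside one block share the remaining
$B - 1$ block fitnesses, which is the source of the correlations; with
`B = 1` every sequence gets an independent fitness.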

Adaptive walk model for long sequences: We work with haploid
binary sequences of length $L$ in the strong selection-weak mutation (SSWM)
regime. If $N$ is the population size, the SSWM regime corresponds to
$Ns \gg 1$, $N\mu \ll 1$ where $s$ is the selection coefficient and $\mu$ is
the mutation probability per locus per generation. Since the expected number
of mutants produced per generation is much smaller than one, mutations occur
sequentially and double and higher mutations may be neglected. Thus the
mutational neighbourhood of a sequence is limited to the $L$ mutants which
are a single mutation away from it. If the fitnesses of the wild type
sequence and its $L$ one-mutant neighbors are arranged in descending order
with the best fitness assigned the rank 1, the transition probability that
the population moves from the wild type with fitness rank $i$ and value
$f_i$ to a mutant with rank $j < i$ and value $f_j$ is proportional to the
fixation probability which is well approximated by $2(f_j - f_i)/f_i$ in the
strong selection limit (Gillespie 1991). The normalised transition
probability from fitness $f_i$ to fitness $f_j$ is given by

$$T(f_j \leftarrow f_i) = \frac{f_j - f_i}{\sum_{k=1}^{i-1} (f_k - f_i)}, \quad 1 \le j \le i - 1 \qquad (4)$$

Once the population has moved to a mutant sequence with fitness $f_j$ with
probability $T(f_j \leftarrow f_i)$, it produces a set of new mutants which
are rank ordered and chosen according to (4), and the process repeats itself
until the population reaches a local optimum whose nearest neighbors are all
less fit than itself. Note that the parameters $N$ and $\mu$ have dropped
out of the picture and the properties of the model depend on the sequence
length (or the initial rank) and the distribution of sequence fitnesses.
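The walk just described can be sketched as a short simulation. This is only an
illustration under stated assumptions: each step draws $L$ fresh i.i.d.
neighbour fitnesses (the long-sequence simplification, not an exact
enumeration of a genotype space), the default fitness distribution is
exponential, and the function name and arguments are hypothetical.

```python
import random


def adaptive_walk(L, rng, draw=None, f0=0.0):
    """Return the list of fitnesses visited by one adaptive walk started at f0.

    At every step L novel neighbour fitnesses are drawn from p(f) via
    draw(rng); a fitter neighbour is chosen with probability proportional to
    its fitness gain, and the walk stops when no neighbour is fitter.
    """
    if draw is None:
        draw = lambda r: r.expovariate(1.0)  # exponential p(f), an assumption
    path = [f0]
    while True:
        h = path[-1]
        better = [g for g in (draw(rng) for _ in range(L)) if g > h]
        if not better:
            return path  # local optimum reached
        weights = [g - h for g in better]  # fixation probability ~ fitness gain
        path.append(rng.choices(better, weights=weights, k=1)[0])
```

The walk length is `len(path) - 1`. Averaging it over many walks at several
values of `L` gives a quick numerical feel for how slowly the number of steps
grows with sequence length.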

The model described above has been studied using (4) and EVT in previous
works (Gillespie 1991; Orr 2002, 2006; Joyce et al. 2008) assuming the
initial fitness to be high (small $i$). In contrast, we start with a low
fitness and write a recursion relation for the probability $P_J(f)$ that an
adaptive walk has at least $J$ steps and the fitness is $f$ at the $J$th
step, following Flyvbjerg and Lautrup (1992) who studied this distribution
for random adaptive walks (see Appendix A). In the following discussion, it
is assumed that the sequence length is large which allows the following two
simplifications: first, the events in which a sequence is backtracked can be
ignored and second, the transition rates can be written in terms of absolute
fitnesses instead of fitness ranks. Consider a population at the $J$th
adaptive step with fitness $h$. It can proceed to the next step provided at
least one fitter mutant is available. If $q(h) = \int_l^h dg\, p(g)$, this
event occurs with a probability $1 - q^L(h)$ where it is assumed that at
each step in the evolutionary process, $L$ novel mutants are available which
have not been encountered before. While this is true at the first step, the
number of novel mutants is $L - 1$ at the second step since one of the
mutants is the parent sequence itself which is not an allowed descendant as
the walk always proceeds uphill. In fact for any $J \ge 2$, some of the
mutants have already been probed but the error introduced by ignoring this
complication is of the order of $1/L$ which is negligible for large $L$
(Flyvbjerg and Lautrup 1992). Then for long sequences we can write

$$P_{J+1}(f) = \int_l^f dh\, p(f)\, T(f \leftarrow h)\, (1 - q^L(h))\, P_J(h), \quad J \ge 0 \qquad (5)$$



where $p(f)T(f \leftarrow h)$ gives the probability that a mutant with
fitness $f > h$ is chosen. Furthermore for large $L$, it is a good
approximation to replace the sum in the denominator of (4) by an integral
and we may write

$$T(f \leftarrow h) = \frac{f - h}{\int_h^u dg\, (g - h)\, p(g)}, \quad f > h \qquad (6)$$

Thus we work with absolute fitnesses instead of fitness ranks. Since the
transition probability (6) is undefined for slowly decaying fitness
distributions $p(f) \sim f^{-\delta}$, $\delta \le 2$, we restrict
$\delta > 2$ in (1). Using (6) in (5), we finally obtain



$$P_{J+1}(f) = \int_l^f dh\, \frac{(f - h)\, p(f)}{\int_h^u dg\, (g - h)\, p(g)}\, (1 - q^L(h))\, P_J(h), \quad J \ge 0 \qquad (7)$$



Equation (7) is the central equation of this article and we will employ it
to obtain various results on the statistical properties of adaptive walks.
In the following, we assume the initial condition $P_0(f) = \delta(f)$
corresponding to zero initial fitness. As $P_J(f)$ obeys an integral
equation, which is harder to analyse, we may try to write a differential
equation for $P_J(f)$. Differentiating (7) with respect to $f$, we get:

$$P'_{J+1}(f) = \int_l^f dh\, \frac{(f - h)\, p'(f) + p(f)}{\int_h^u dg\, (g - h)\, p(g)}\, (1 - q^L(h))\, P_J(h), \quad J \ge 0 \qquad (8)$$

$$P''_{J+1}(f) = \int_l^f dh\, \frac{(f - h)\, p''(f) + 2p'(f)}{\int_h^u dg\, (g - h)\, p(g)}\, (1 - q^L(h))\, P_J(h) + \frac{p(f)\,(1 - q^L(f))}{\int_f^u dg\, (g - f)\, p(g)}\, P_J(f), \quad J \ge 1 \qquad (9)$$

where prime denotes an $f$-derivative. On using (7) and (8) in (9), we find

$$P''_{J+1}(f) = 2\,\frac{p'(f)}{p(f)}\, P'_{J+1}(f) + \left[\frac{p''(f)}{p(f)} - 2\left(\frac{p'(f)}{p(f)}\right)^2\right] P_{J+1}(f) + \frac{p(f)\,(1 - q^L(f))}{\int_f^u dg\, (g - f)\, p(g)}\, P_J(f), \quad J \ge 1 \qquad (10)$$



The first derivative term in the above equation can be eliminated by writing
$P_J(f) = p(f)\,\mathcal{P}_J(f)$ which finally yields

$$\mathcal{P}''_{J+1}(f) = \frac{p(f)\,(1 - q^L(f))}{\int_f^u dg\, (g - f)\, p(g)}\, \mathcal{P}_J(f), \quad J \ge 1 \qquad (11)$$



In this article, we will restrict our attention to exponentially and
uniformly distributed fitnesses as these two fitness distributions are
consistent with the available empirical data. We show that due to (11), a
second order ordinary differential equation is obeyed by a generating
function of $\mathcal{P}_J(f)$ for these two distributions which can be
solved within an approximation subject to the following boundary conditions:

$$\mathcal{P}_J(f)\big|_{f=l} = 0, \quad J \ge 1 \qquad (12)$$

$$\mathcal{P}'_J(f)\big|_{f=l} = \frac{\delta_{J,1}}{\int_l^u dg\, g\, p(g)}, \quad J \ge 1 \qquad (13)$$

where (12) is a direct consequence of (7) and the equation (13) arises on
using the initial condition in (8).

Besides $P_J(f)$, we also find the walk length distribution $Q_J$ and the
average fitness $\bar{f}_J$ at the $J$th step which can be related to
$P_J(f)$ as explained below. Integrating over $f$ on both sides of (7), we
get

$$P_{J+1} = \int_l^u df\, P_{J+1}(f) \qquad (14)$$

$$= \int_l^u dh \int_h^u df\, \frac{(f - h)\, p(f)}{\int_h^u dg\, (g - h)\, p(g)}\, (1 - q^L(h))\, P_J(h) \qquad (15)$$

$$= \int_l^u dh\, (1 - q^L(h))\, P_J(h) = P_J - \int_l^u dh\, q^L(h)\, P_J(h) \qquad (16)$$

Then the walk length probability $Q_J$ that exactly $J$ steps are taken is
given by

$$Q_J = P_J - P_{J+1} = \int_l^u dh\, q^L(h)\, P_J(h) \qquad (17)$$

with $Q_0 = 0$ since the initial fitness is zero. The above equation has a
simple interpretation: since $P_J(h)$ is the probability that at least $J$
steps are taken and the fitness at the $J$th step is $h$, exactly $J$ steps
will be taken if all the $L$ mutants of the sequence at the $J$th step carry
a fitness smaller than $h$, from which (17) follows. The average walk length
is $\bar{J} = \sum_J J Q_J$, where for large $L$ the sum over $J$ may be
extended to infinity, $\bar{J} \approx \sum_{J=0}^{\infty} J Q_J$. The
average fitness $\bar{f}_J$ is defined as
$\bar{f}_J = \int_l^u df\, f\, P_J(f)$. Using (7), we can write

$$\bar{f}_{J+1} = \int_l^u df\, f \int_l^f dh\, \frac{(f - h)\, p(f)}{\int_h^u dg\, (g - h)\, p(g)}\, (1 - q^L(h))\, P_J(h) \qquad (18)$$

$$= \int_l^u dh\, \frac{(1 - q^L(h))\, P_J(h)}{\int_h^u dg\, (g - h)\, p(g)} \int_h^u df\, f\, (f - h)\, p(f) \qquad (19)$$

Note that neither (17) nor (19) is a closed equation.

Our analytical results are also compared with numerical simulations which
were performed using an exact procedure for $L < 10$ and an approximate
method outlined in Orr (2002) for larger $L$. We refer the reader to
Appendix B for details.

RESULTS 

Average fitness and walk length for general fitness distributions: 

For a broad class of fitness distributions, the average fitness for an
infinitely long sequence can be computed. Although this limit is
biologically unrealistic, it provides a good approximation to the average
fitness $\bar{f}_J$ for small $J$ (see Fig. 1) as the population cannot
sense the finiteness of the sequence length far from the local optimum. On
taking the limit $L \to \infty$ in (19) and denoting the average fitness in
this limit by $F_J$, we obtain

$$F_{J+1} = \int_l^u dh\, P_J(h)\big|_{L \to \infty}\, \frac{\int_h^u df\, f\, (f - h)\, p(f)}{\int_h^u dg\, (g - h)\, p(g)} \qquad (20)$$



Algebraically decaying fitness distributions: On substituting (1) in (20)
and performing the integrals involving $p(f)$, we get

$$F_{J+1} = \int_0^{\infty} dh\, \frac{2 + (\delta - 1)h}{\delta - 3}\, P_J(h)\big|_{L \to \infty} = \frac{2}{\delta - 3} + \frac{\delta - 1}{\delta - 3}\, F_J, \quad \delta > 3 \qquad (21)$$

where we have used that $P_J|_{L \to \infty} = 1$ due to (16) and the
initial condition $P_0 = 1$. Repeated iteration with $F_0 = 0$ yields

$$F_J = \left(\frac{\delta - 1}{\delta - 3}\right)^J - 1 \qquad (22)$$

which increases geometrically with $J$. This result is compared in Fig. 1a
with the average fitness for finite sequences which shows that the number of
steps up to which $\bar{f}_J$ and $F_J$ match increases with $L$.

Exponential fitness distribution: For fitness distributions given by (2),
the equation for $F_J$ does not close except for $\gamma = 1$. For
$p(f) = e^{-f}$, we get $F_J = 2 + F_{J-1}$ which gives

$$F_J = 2J \qquad (23)$$

Fig. 1b shows that at larger $J$, the fitness $\bar{f}_J$ increases more
slowly than this constant rate.

Bounded fitness distributions: A calculation similar to the above for $p(f)$
in (3) gives

$$F_{J+1} = \frac{2 + \nu F_J}{2 + \nu} \qquad (24)$$

and therefore

$$F_J = 1 - \left(\frac{\nu}{2 + \nu}\right)^J \qquad (25)$$

For uniformly distributed fitness ($\nu = 1$), we find that
$1 - F_J = 3^{-J}$, in good agreement with the numerical data in Fig. 1 for
small $J$.
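The recursions for the infinite-sequence average fitness are easy to iterate
numerically. The sketch below does so for the three cases and is meant only as
a check of the closed forms (22), (23) and (25); the function names are
hypothetical and the starting value $F_0 = 0$ is the zero initial fitness
assumed in the text.

```python
def iterate(step, n_steps, F0=0.0):
    """Apply F_{J+1} = step(F_J) repeatedly and return [F_0, ..., F_{n_steps}]."""
    F = [F0]
    for _ in range(n_steps):
        F.append(step(F[-1]))
    return F


def algebraic_step(delta):
    """Recursion (21); requires delta > 3."""
    return lambda F: (2.0 + (delta - 1.0) * F) / (delta - 3.0)


def exponential_step(F):
    """Recursion F_J = 2 + F_{J-1} for p(f) = exp(-f)."""
    return F + 2.0


def bounded_step(nu):
    """Recursion (24) for the bounded case."""
    return lambda F: (2.0 + nu * F) / (2.0 + nu)


if __name__ == "__main__":
    for J, F in enumerate(iterate(bounded_step(1.0), 5)):
        print(J, F, 1.0 - 3.0 ** (-J))  # uniform case: 1 - F_J = 3**(-J)
```

The three cases show geometric growth, linear growth and geometric approach to
the upper bound respectively.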

We now give an argument to estimate the average walk length $\bar{J}$ using
the above results for the average fitness $F_J$ and the EVT (Flyvbjerg and
Lautrup 1992). We first note that since $P_J|_{L \to \infty} = 1$ for all
$J$, every step in the adaptive walk is definitely taken for infinitely long
sequences and hence the average walk length is expected to diverge with $L$.
For a sequence of finite length, the adaptive walk stops when the population
has reached a local optimum whose fitness is the largest among $L + 1$
i.i.d. random variables. But since the average number of fitnesses with
value $> f$ is given by $(L + 1)(1 - q(f))$, at a local optimum we have
(Sornette 2000)

$$(L + 1) \int_{F_{\bar{J}}}^{u} df\, p(f) = 1 \qquad (26)$$

where we have approximated $\bar{f}_J$ by $F_J$. The above equation yields

$$F_{\bar{J}} \approx L^{1/(\delta - 1)} \quad \text{(Algebraic)} \qquad (27)$$

$$F_{\bar{J}} \approx \ln L \quad \text{(Exponential)} \qquad (28)$$

$$F_{\bar{J}} \approx 1 - L^{-1/\nu} \quad \text{(Bounded)} \qquad (29)$$



On matching the expected fitness $F_{\bar{J}}$ with the $F_J$ obtained in
the above discussion for various distributions, we get

$$\bar{J} \approx \frac{1}{(\delta - 1) \ln\left(\frac{\delta - 1}{\delta - 3}\right)}\, \ln L \quad \text{(Algebraic)} \qquad (30)$$

$$\bar{J} \approx \frac{1}{2}\, \ln L \quad \text{(Exponential)} \qquad (31)$$

$$\bar{J} \approx \frac{1}{\nu \ln\left(\frac{2 + \nu}{\nu}\right)}\, \ln L \quad \text{(Bounded)} \qquad (32)$$

Thus the above argument shows that for large $L$,

$$\bar{J} \approx \alpha \ln L \qquad (33)$$

where the prefactor $\alpha$ depends on $p(f)$. We note that
$\alpha_{\text{algebraic}} < \alpha_{\text{exponential}} < \alpha_{\text{bounded}}$
which implies that a smaller number of substitutions occurs for fat-tailed
fitness distributions than for the bounded ones. To understand this
qualitative trend, consider the transition probability for the first step
given by $T(f \leftarrow 0)\, p(f) \sim f\, p(f)$. At large $f$, this
probability is higher for slowly decaying distributions and thus a large
fitness gain occurs initially. But as the probability to exceed the high
fitness achieved at the first step is small, the walk terminates sooner for
broad distributions.
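The ordering of the prefactors can be checked numerically. The sketch below
evaluates $\alpha$ as it follows from matching the geometric, linear and
bounded growth laws of $F_J$ with the extreme value estimate (26); the
algebraic and bounded expressions are our reading of (30) and (32), and the
function names are hypothetical.

```python
import math


def alpha_algebraic(delta):
    """Prefactor from matching (22) with (27); requires delta > 3."""
    return 1.0 / ((delta - 1.0) * math.log((delta - 1.0) / (delta - 3.0)))


def alpha_exponential():
    """Prefactor from matching F_J = 2J with ln L."""
    return 0.5


def alpha_bounded(nu):
    """Prefactor from matching (25) with (29); requires nu > 0."""
    return 1.0 / (nu * math.log((2.0 + nu) / nu))


if __name__ == "__main__":
    for delta in (3.5, 4.0, 6.0, 20.0):
        print("algebraic, delta =", delta, alpha_algebraic(delta))
    print("exponential", alpha_exponential())
    for nu in (0.5, 1.0, 2.0, 10.0):
        print("bounded, nu =", nu, alpha_bounded(nu))
```

Both the algebraic and the bounded prefactors tend to 1/2 as $\delta$ and
$\nu$ grow, approaching the exponential value from below and above
respectively.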

The results of our numerical simulations for $\bar{J}$ shown in Fig. 2 are
in agreement with the logarithmic dependence on $L$ but the value of the
prefactor does not match that obtained above (except for $p(f) = e^{-f}$).
The prefactor $\alpha$ is expected to interpolate between the two limiting
cases of adaptive walks, namely the greedy walk in which the best mutant is
chosen with probability one and the random adaptive walk in which all better
mutants are chosen with equal probability. The former limit is obtained when
$\delta \to 1$ in (1) and the latter when $\nu \to 0$ in (3) (Joyce et al.
2008). Since the average walk length for a greedy walker is a finite
constant equal to $e - 1 \approx 1.718$ for infinitely long sequences (Orr
2003), the prefactor $\alpha = 0$ while $\alpha = 1$ for the random adaptive
walk (see Appendix A). In the following sections, we find that
$\alpha = 1/2$ for exponentially distributed fitness and $2/3$ for the
uniform case, which are consistent with the results in Fig. 2 and the
analytical results of Neidhart and Krug (2011) which are obtained using a
simpler version of the adaptive walk model considered here.

Fitness distribution at the first step for general distributions: If
the whole population is assumed to have an initial fitness $f_0$, using
$P_0(f) = \delta(f - f_0)$ in (7) we have

$$P_1(f) = \frac{(f - f_0)\, p(f)\, (1 - q^L(f_0))}{\int_{f_0}^{u} dg\, (g - f_0)\, p(g)} \propto (f - f_0)\, p(f) \qquad (34)$$

The above fitness distribution at the first step is nonmonotonic for all
fitness distributions in (1)-(3) except for truncated distributions with
$\nu \le 1$. The implications of this result are examined in DISCUSSION.

Entire walk with exponentially distributed fitness: For
$p(f) = e^{-f}$, from (11) we obtain

$$\mathcal{P}''_{J+1}(f) = (1 - q^L(f))\, \mathcal{P}_J(f), \quad J \ge 1 \qquad (35)$$

where $q(f) = 1 - e^{-f}$. Due to (12) and (13), the boundary conditions are
$\mathcal{P}_J(0) = 0$ and $\mathcal{P}'_J(0) = \delta_{J,1}$.

We define a generating function
$G(x, f) = \sum_{J=1}^{\infty} \mathcal{P}_J(f)\, x^J$, $x < 1$, which obeys
the following second order ordinary differential equation:

$$G''(x, f) = x\, (1 - q^L(f))\, G(x, f) \qquad (36)$$

To arrive at the above equation, we have used that $\mathcal{P}_1(f) = f$
which is obtained on using the initial condition in (7). The generating
function $G(x, f)$ obeys a Schrodinger equation for the wave function of a
particle in a one-dimensional potential $V(f) \sim 1 - q^L(f)$ and energy
zero (Mathews and Walker 1970). Since
$1 - q^L(f) \approx 1 - e^{-L e^{-f}}$ is close to unity for $f \ll \ln L$
and vanishes for $f \gg \ln L$, the potential $V(f)$ decreases smoothly from
one to zero and moves rightwards with increasing $L$. Similar potentials
also arise when two materials with different transport properties are joined
together and in such systems, an analytical solution is obtained within a
step function potential approximation (Blonder et al. 1982; Schaeybroeck and
Lazarides 2009). We follow this approach here and approximate the
distribution $1 - q^L(f)$ by the Heaviside theta function
$\Theta(\tilde{f} - f)$ where $\tilde{f} = \ln L$. Within this step
distribution approximation, we have

$$G''(x, f) = \begin{cases} x\, G(x, f), & f < \tilde{f} \\ 0, & f > \tilde{f} \end{cases} \qquad (37)$$

For $f < \tilde{f}$, the differential equation (37) has a solution of the
form $G_<(x, f) = a_+ e^{\sqrt{x} f} + a_- e^{-\sqrt{x} f}$ which reduces to
$G_<(x, f) = c \sinh(\sqrt{x} f)$ since $G(x, 0) = 0$ due to
$\mathcal{P}_J(0) = 0$. Since the solution for $f < \tilde{f}$ cannot depend
on $\tilde{f}$, we appeal to the infinite sequence length limit to fix the
proportionality constant $c$. As noted earlier, $P_J|_{L \to \infty} = 1$
for all $J \ge 0$ which implies that

$$\int_0^{\infty} df\, e^{-f}\, G_<(x, f) = \frac{x}{1 - x} \qquad (38)$$

and therefore

$$G_<(x, f) = \sqrt{x}\, \sinh(\sqrt{x} f) \qquad (39)$$

We check that the boundary condition $\mathcal{P}'_J(0) = \delta_{J,1}$,
which is equivalent to $G'(x, 0) = x$, is also satisfied by the above
solution.

For $f > \tilde{f}$, the solution is $G_>(x, f) = af + b$ where the
constants of integration $a$, $b$ can be fixed by matching the solutions
$G_<$ and $G_>$ and their first derivatives at $f = \tilde{f}$. Thus the
constants $a$ and $b$ are determined by the following conditions:

$$G_<(x, \tilde{f}) = G_>(x, \tilde{f}) = a\tilde{f} + b \qquad (40)$$

$$G'_<(x, f)\big|_{f = \tilde{f}} = G'_>(x, f)\big|_{f = \tilde{f}} = a \qquad (41)$$

Simple algebra shows that

$$G_>(x, f) = x \cosh(\sqrt{x} \tilde{f})\, (f - \tilde{f}) + \sqrt{x}\, \sinh(\sqrt{x} \tilde{f}) \qquad (42)$$

Using the above expressions for $G(x, f)$, the fitness distribution
$P_J(f)$ for the fixed beneficial mutations can be calculated. On expanding
(39) and (42) in a power series about $x = 0$ and picking the coefficient of
$x^J$, we have

$$P_J(f) = \frac{e^{-f}\, \tilde{f}^{\,2J-1}}{(2J - 1)!} \times \begin{cases} r^{2J-1}, & r < 1 \\ (2J - 1)\, r - (2J - 2), & r > 1 \end{cases} \qquad (43)$$

where $r = f/\tilde{f}$. Figure 3 shows our numerical results for $P_J(f)$
for the first few adaptive steps. As the walk proceeds, the distribution
moves rightwards as expected and its amplitude decreases since the
probability $q^L(f)$ that the walker cannot find a better neighbor
approaches unity with increasing $f$. Our analytical result (43) is also
shown in Fig. 3 for comparison. For $L = 10^3$, the step distribution
approximation used to find (43) gives $1 - q^L(f) \approx 1$ for
$f < \ln L \approx 6.9$ and zero otherwise. However, as the probability
$1 - q^L(f)$ stays close to unity for $f < 5$ and decreases gradually to
zero when $f \approx 12$, the distribution (43) in the region $5 < f < 12$
does not match well with the simulation results but outside this crossover
region, we see a good quantitative agreement. We also note that the fitness
distribution does not move appreciably for $J > 4$ and is centred around
$f \approx 7$ (see inset of Fig. 3). This is because the average walk length
for $L = 10^3$ is about 4.6 steps (refer Fig. 2) and as the local optimum is
approached, the fitness distribution of fixed beneficial mutations remains
centred close to the typical fitness of the local optimum given by (26)
which is $\ln L \approx 6.9$. This also explains the initial linear rise in
the average fitness followed by a slower increase in Fig. 1.
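The width of the crossover region quoted above can be reproduced directly from
$1 - q^L(f)$ with $q(f) = 1 - e^{-f}$. The following sketch compares the exact
probability with the step function used in the approximation; the function
names are hypothetical and `log1p`/`expm1` are used only for numerical
accuracy.

```python
import math


def prob_fitter_neighbour(f, L):
    """Exact 1 - q(f)**L for p(f) = exp(-f); requires f > 0."""
    log_q = math.log1p(-math.exp(-f))  # log q(f) with q(f) = 1 - exp(-f)
    return -math.expm1(L * log_q)      # 1 - exp(L * log q(f))


def step_approximation(f, L):
    """Heaviside step used in the text: 1 below ln L, 0 above."""
    return 1.0 if f < math.log(L) else 0.0


if __name__ == "__main__":
    L = 1000
    for f in (3.0, 5.0, math.log(L), 9.0, 12.0):
        print(f, prob_fitter_neighbour(f, L), step_approximation(f, L))
```

At the threshold $f = \ln L$ the exact probability is close to $1 - 1/e$,
which makes clear why the step approximation is least accurate in the
crossover region.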

We next calculate the walk length distribution $Q_J$ defined by (17). Since
$q^L(f) = \Theta(f - \tilde{f})$ within the step distribution approximation
discussed above, (17) reduces to

$$Q_J = \int_{\tilde{f}}^{\infty} df\, P_J(f) \qquad (44)$$

On integrating $P_J(f)$ given in (43), we get

$$Q_J = e^{-\ln L} \left[\frac{(\ln L)^{2J-2}}{(2J - 2)!} + \frac{(\ln L)^{2J-1}}{(2J - 1)!}\right], \quad J \ge 1 \qquad (45)$$

This expression is compared with numerical results in Fig. 4 and shows a
reasonable agreement. The average number of adaptive steps calculated using
(45) is given by

$$\bar{J} = \sum_{J=1}^{\infty} J\, Q_J \approx \frac{1}{2} \ln L \qquad (46)$$

which is in good agreement with the simulation result in Fig. 2. The width
of the distribution $Q_J$ measured using the variance
$\sigma^2 = \overline{J^2} - \bar{J}^2 \approx \ln L/4$ also increases with
$L$.
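As a numerical check of the step-approximation result, the sketch below
tabulates $Q_J = e^{-\ln L}\,[(\ln L)^{2J-2}/(2J-2)! + (\ln L)^{2J-1}/(2J-1)!]$
and computes its first two moments. The function names and the truncation
`J_max` are assumptions of this illustration.

```python
import math


def walk_length_dist(L, J_max=100):
    """Return [Q_0, Q_1, ..., Q_J_max] in the step approximation; requires L > 1."""
    t = math.log(L)  # threshold fitness ln L

    def weight(n):
        # exp(-t) * t**n / n!, evaluated through logarithms for stability
        return math.exp(n * math.log(t) - t - math.lgamma(n + 1))

    Q = [0.0]  # Q_0 = 0 since the initial fitness is zero
    for J in range(1, J_max + 1):
        Q.append(weight(2 * J - 2) + weight(2 * J - 1))
    return Q


def moments(Q):
    """Mean and variance of the walk length distribution."""
    mean = sum(J * q for J, q in enumerate(Q))
    second = sum(J * J * q for J, q in enumerate(Q))
    return mean, second - mean * mean


if __name__ == "__main__":
    for L in (10**2, 10**3, 10**4):
        mean, var = moments(walk_length_dist(L))
        print(L, mean, 0.5 * math.log(L), var, 0.25 * math.log(L))
```

Summing the series exactly gives a mean of $(\ln L)/2 + 3/4$ up to terms of
order $L^{-2}$, so the leading behaviour $(\ln L)/2$ carries a constant
offset at any finite $L$.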

Entire walk with uniformly distributed fitness: For $p(f) = 1$, since
$\mathcal{P}_J(f) = P_J(f)$, the differential equation (11) reduces to

$$P''_{J+1}(f) = \frac{1 - f^L}{\int_f^1 dg\, (g - f)}\, P_J(f) = \frac{2\,(1 - f^L)}{(1 - f)^2}\, P_J(f), \quad J \ge 1 \qquad (47)$$

with boundary conditions $P_J(0) = 0$ and $P'_J(0) = 2\delta_{J,1}$. As
before, we define a generating function
$G(x, f) = \sum_{J=2}^{\infty} x^{J-2} P_J(f)$ which obeys the following
second order ordinary differential equation:

$$G''(x, f) = \frac{2\,(1 - f^L)}{(1 - f)^2}\, \left(x\, G(x, f) + 2f\right) \qquad (48)$$

where we have used that $P_1(f) = 2f$. We treat this case also within the
step distribution approximation discussed earlier. Since the probability
$1 - f^L \approx 1 - e^{-L(1 - f)}$, we approximate it by a step function
$\Theta(\tilde{f} - f)$ where $\tilde{f} = (L - 1)/L$. For
$f < \tilde{f}$, we obtain an inhomogeneous second order ordinary
differential equation with variable coefficients:

$$G''_<(x, f) = \frac{2x}{(1 - f)^2}\, G_<(x, f) + \frac{4f}{(1 - f)^2} \qquad (49)$$

This equation can be solved by standard methods (as detailed in Appendix C)
to yield

$$G_<(x, f) = a_+ (1 - f)^{\alpha_+} + a_- (1 - f)^{\alpha_-} + u_+(f)\, (1 - f)^{\alpha_+} + u_-(f)\, (1 - f)^{\alpha_-} \qquad (50)$$

where the exponents

$$\alpha_\pm = \frac{1 \pm \sqrt{1 + 8x}}{2} \qquad (51)$$

The first two terms on the right hand side give the solution of the
homogeneous equation and the last two terms are the particular integral
involving the variational parameters $u_\pm(f)$ given in Appendix C. The
constants of integration $a_\pm$ can be obtained using the boundary
conditions $G(x, 0) = 0$ and
$\int_0^1 df\, G_<(x, f) = (1 - x)^{-1}$. After some straightforward
algebra, we find that

$$G_<(x, f) = -\frac{2}{x} \left[\frac{(1 - f)^{\alpha_+} - (1 - f)^{\alpha_-}}{\alpha_+ - \alpha_-} + f\right] \qquad (52)$$



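We verify that the condition $P'_J(0) = 0$ for $J > 1$, which amounts to
$G'(x, 0) = 0$, is also satisfied. For $f > \tilde{f}$, as
$G''_>(x, f) = 0$, the solution is $G_>(x, f) = af + b$ where $a$, $b$ can
be determined using (40) and (41) to give

$$G_>(x, f) = -\frac{2}{x} \left[\frac{\alpha_- (1 - \tilde{f})^{\alpha_- - 1} - \alpha_+ (1 - \tilde{f})^{\alpha_+ - 1}}{\alpha_+ - \alpha_-} + 1\right] f - \frac{2}{x}\, \frac{(1 - \tilde{f})^{\alpha_+} - (1 - \tilde{f})^{\alpha_-} - \alpha_- \tilde{f}\, (1 - \tilde{f})^{\alpha_- - 1} + \alpha_+ \tilde{f}\, (1 - \tilde{f})^{\alpha_+ - 1}}{\alpha_+ - \alpha_-} \qquad (53)$$

Explicit expressions for $P_J(f)$ for the first few adaptive steps are given
in Appendix C and a comparison between the analytical and the simulation
results is shown in Fig. 5.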

To find the walk length distribution
$Q_J = \int_{\tilde{f}}^{1} df\, P_J(f)$, we define

$$H(x) = \sum_{J=1}^{\infty} x^J Q_J = x\, Q_1 + x^2 \int_{\tilde{f}}^{1} df\, G_>(x, f) \qquad (54)$$

$$= \frac{x\, (1 - \tilde{f})}{\alpha_- - \alpha_+} \left[(2 - \alpha_+)\, (1 - \tilde{f})^{\alpha_+} - (2 - \alpha_-)\, (1 - \tilde{f})^{\alpha_-}\right] \qquad (55)$$

As an explicit expression for $Q_J$ is rather unwieldy, its derivation and
the expression itself are given in Appendix C and a comparison with the
simulations is shown in Fig. 6. The average number of steps is given by

$$\bar{J} = \frac{dH(x)}{dx}\bigg|_{x=1} = \frac{7 - 6 \ln(1 - \tilde{f}) + 2\, (1 - \tilde{f})^3}{9} \qquad (56)$$

which shows that for large $L$, the number of adaptive steps grows as
$(2/3) \ln L$ in agreement with the numerical results shown in Fig. 2. The
higher moments can also be found straightforwardly and we find that the
variance $\overline{J^2} - \bar{J}^2 \approx (10/27) \ln L$ and the skewness
of the distribution decays slowly as $(\ln L)^{-1/2}$.
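The growth law for the uniform case can be checked by differentiating the
generating function numerically. The sketch below evaluates
$H(x) = \frac{x\epsilon}{\alpha_- - \alpha_+}[(2 - \alpha_+)\epsilon^{\alpha_+} - (2 - \alpha_-)\epsilon^{\alpha_-}]$
with $\epsilon = 1 - \tilde{f} = 1/L$ and
$\alpha_\pm = (1 \pm \sqrt{1 + 8x})/2$, and takes a central difference at
$x = 1$. The function names and the step size `h` are assumptions; the
constant offset 7/9 used for comparison is our own evaluation of the
derivative and holds up to terms of order $L^{-3}$.

```python
import math


def H(x, L):
    """Generating function of the walk length for uniform fitness (step approximation)."""
    eps = 1.0 / L  # 1 - f_tilde with f_tilde = (L - 1) / L
    root = math.sqrt(1.0 + 8.0 * x)
    a_plus = 0.5 * (1.0 + root)
    a_minus = 0.5 * (1.0 - root)
    bracket = (2.0 - a_plus) * eps ** a_plus - (2.0 - a_minus) * eps ** a_minus
    return x * eps / (a_minus - a_plus) * bracket


def mean_walk_length(L, h=1e-5):
    """Central-difference estimate of dH/dx at x = 1."""
    return (H(1.0 + h, L) - H(1.0 - h, L)) / (2.0 * h)


if __name__ == "__main__":
    for L in (10**2, 10**3, 10**4):
        print(L, mean_walk_length(L), (2.0 / 3.0) * math.log(L) + 7.0 / 9.0)
```

The normalisation $H(1) = 1$ provides a quick sanity check that the walk
length probabilities sum to one.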

Effect of correlations on the number of adaptive steps: We now
turn to a discussion of adaptive walk properties when the fitnesses are
correlated and given by a block model. We compute the average number
$\bar{J}_B(L)$ of adaptive steps given by
$\sum_{J=1}^{\infty} J\, Q_J(L, B)$ where $Q_J(L, B)$ is the probability
that exactly $J$ adaptive mutations occur when a sequence of length $L$ is
divided into $B$ blocks.

Consider the distribution $Q(m_1, \ldots, m_B)$ which gives the joint
probability that the $i$th block of length $L_B$ in a sequence of length $L$
carries $m_i$ adaptive mutations where $i = 1, \ldots, B$. An important
property of the block model is that this joint distribution factorises, that
is (Perelson and Macken 1995)

$$Q(m_1, \ldots, m_B) = \prod_{b=1}^{B} Q_{m_b}(L_B, 1) \qquad (57)$$

where $Q_J(L_B, 1) = Q_J(L_B)$ is the walk length probability when the
fitnesses are uncorrelated and the sequence length is $L_B$. The above
equation expresses the fact that the block fitnesses evolve independently.
As only one mutation occurs in the sequence at any step, so that all but one
block sequence remain unchanged, and since the block fitnesses are i.i.d.
random variables, (57) holds.

Since the distribution $Q_J(L,B)$ is given by

$$ Q_J(L,B) = \sum_{m_1,\dots,m_B=0}^{J} Q(m_1,\dots,m_B)\,\delta(m_1+\dots+m_B-J) \tag{58} $$

it follows that

$$
\begin{aligned}
\bar J_B(L) &= \sum_{J=1}^{\infty} J \sum_{m_B=0}^{J} Q_{m_B}(L_B) \sum_{m_1,\dots,m_{B-1}=0}^{J-m_B}\ \prod_{b=1}^{B-1} Q_{m_b}(L_B)\ \delta\Big(\sum_{b=1}^{B-1} m_b-(J-m_B)\Big) \\
&= \sum_{J=1}^{\infty} J \sum_{m_B=0}^{J} Q_{m_B}(L_B)\, Q_{J-m_B}(L-L_B,B-1) \\
&= \sum_{m=0}^{\infty} Q_m(L-L_B,B-1) \sum_{n=0}^{\infty} (n+m)\, Q_n(L_B) \\
&= \bar J(L_B) + \sum_{m=1}^{\infty} m\, Q_m(L-L_B,B-1) \\
&= \bar J(L_B) + \bar J_{B-1}(L-L_B) \\
&= B\, \bar J(L_B)
\end{aligned}
\tag{59}
$$

where we have used that $\sum_{J=0}^{\infty} Q_J(L,B) = 1$ and $\bar J$ is the average number of steps in the adaptive walk for uncorrelated fitnesses. Figure 7 shows the results of our numerical simulations for the average walk length when the block length $L_B = L/B$ is kept fixed and the block fitnesses are exponentially and uniformly distributed. For fixed $L_B$, (59) predicts that $\bar J_B$ increases linearly with $B$, which is in excellent agreement with the numerical data.
For large $L$, due to the logarithmic growth of $\bar J$ with the sequence length, we have

$$ \bar J_B(L) \approx \alpha B \ln(L/B) \tag{60} $$

For small $B$, a linear rise in the average number of steps with the number of blocks has been seen numerically for exponential-like distributions, and it was inferred that the mean walk length is independent of the underlying fitness distribution (Orr, 2006). However, as discussed in the previous sections, the average number $\bar J$ depends on the fitness distribution $p(f)$, and therefore the average $\bar J_B$ is also nonuniversal.
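The factorisation (57) together with (58) says that the walk length in the block model is a sum of $B$ independent single-block walk lengths, so that $Q_J(L,B)$ is a $B$-fold convolution of the single-block distribution. A minimal sketch of this statement is given below; the single-block distribution used in the example is a made-up placeholder (it is not data from this work), and the function names are ours.

```python
import numpy as np

def block_walk_distribution(q_single, n_blocks):
    """B-fold convolution of the single-block walk length distribution.

    q_single[m] is the probability that one block carries m adaptive
    mutations; the returned array q[J] is the probability of J
    mutations in total over n_blocks independent blocks.
    """
    q = np.array([1.0])  # zero blocks: zero mutations with certainty
    for _ in range(n_blocks):
        q = np.convolve(q, q_single)
    return q

def mean_walk_length(q):
    """Average number of steps for a distribution q[J], J = 0, 1, 2, ..."""
    return float(np.dot(np.arange(len(q)), q))

# Hypothetical single-block distribution over m = 0, 1, 2, 3.
q_single = np.array([0.0, 0.2, 0.5, 0.3])
q_blocks = block_walk_distribution(q_single, n_blocks=4)
```

The mean of `q_blocks` equals four times the mean of `q_single`, which is the content of (59) for this toy input.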

DISCUSSION 

In the last few years, several analytical results have been obtained for the mutational landscape model (Gillespie, 1991). However, many of these results deal with the first step in the adaptation process (Orr, 2002, 2006; Joyce et al., 2008), and an extension of the theory to the full adaptive walk is necessary. Previous studies also assume that the process of adaptation starts from a highly fit sequence, which is not applicable to situations in which the population is subjected to high stress and hence has a very low initial fitness (MacLean and Buckling, 2009; McDonald et al., 2010). In this article, we have obtained results for the entire adaptive walk starting from a low initial fitness but, as discussed below, we expect some of these results to hold for moderately high initial fitness also.

Walk length distribution and average walk length: In previous works, the walk length distributions for the greedy walk and the random adaptive walk have been studied and found to be universal in that they are independent of the underlying fitness distribution $p(f)$. The origin of this universality property is clear in the light of the results of Joyce et al. (2008), who pointed out that these two models can be obtained as a limit of (4), which defines the mutational landscape model. For the random adaptive walk, the distribution $Q_J$ for an infinitely long sequence vanishes (see (63)) and the average walk length diverges with sequence length. In contrast, for the greedy walk, the walk length distribution in the $L \to \infty$ limit decreases exponentially fast with $J$, as a result of which the average number of steps turns out to be a constant (Orr, 2003; Rosenberg, 2005).

In this article, we have calculated the walk length distribution for exponentially and uniformly distributed fitnesses and found the average walk length for general fitness distributions. An important conclusion of our study is that the average number of adaptive steps increases logarithmically with the sequence length, with a prefactor smaller than unity, if the walk starts from zero fitness. Our simulations (not shown) also indicate that if the initial rank is of order $L$, the average number of steps increases logarithmically with the rank and with the same proportionality constant as that for the zero initial fitness case. Thus for a wild type sequence with initial rank (or $L$) of the order 100, the number of substitutions is expected to be less than 5. Although short adaptive walks have been observed in experiments (Rokyta et al., 2009; Schoustra et al., 2009), more detailed experimental studies testing the logarithmic dependence would be desirable. Although a test of the $L$-dependence of the average walk length may not be experimentally viable, it should be possible to study the average walk length as a function of the initial rank.

Besides the sequence length, the number of steps to a local optimum also depends on the underlying fitness distribution and the fitness correlations. If the fitnesses are uncorrelated, as the numerical data in Fig. 2 shows, the prefactor $\alpha$ in (33) depends on the shape of the fitness distribution, and therefore a rather detailed knowledge of the full fitness distribution (how fast it decays) is required to test this, which is presently unavailable. However, one can discern a trend in the value of $\alpha$: it de-

fitness distribution in the Gumbel class (Imhof and Schlotterer, 2001; Sanjuan et al., 2004; Rokyta et al., 2005; Kassen and Bataillon, 2006; MacLean and Buckling, 2009) will register shorter walks than those in the Weibull domain (Rokyta et al., 2008). As shown here in the block model of correlated fitnesses, the average number of adaptive steps increases as the number of blocks (and hence fitness correlations) increases. This is in accordance with the expectation that on a smooth correlated fitness landscape, as the local optima are less common (Perelson and Macken, 1995), there is less chance of getting trapped and therefore an uphill walk can last longer (Weinberger, 1991; Kauffman, 1993; Orr, 2006).

Distribution of fixed beneficial mutations during the walk: The fitness distribution $P_J(f)$ has not been studied in previous theoretical studies of adaptive walks in the SSWM limit, and here we have computed this fitness distribution analytically using the recursion relation (7). The fitness distribution at the first step given by (34) can give a qualitative idea about the shape of $p(f)$. For most fitness distributions, $P_1(f)$ is expected to be nonmonotonic, but for bounded distributions which diverge at the upper limit or the uniform distribution, $P_1(f)$ increases monotonically towards the upper bound. An inspection of the experimental data of Rokyta et al. (2005) shows the fitness distribution at the first step to be nonmonotonic, which is consistent with their assumption of an exponentially decreasing distribution of beneficial effects. It would be interesting to check if the distribution $P_1(f)$ in Rokyta et al. (2008) is monotonic, as the data in this study is consistent with a uniformly distributed fitness. The above behavior of $P_1(f)$ is expected to be robust in the presence of correlations, as at the first step in evolution the population has not sensed the correlations in the fitness landscape (Orr, 2006).



For the fitness distribution for the entire walk, we presented an analysis for two distributions, namely exponential and uniform, which are consistent with the available experimental data. The distribution $P_J(f)$ is obtained within a step distribution approximation which captures the shape of the fitness distribution correctly for the first few steps and leads to an accurate estimate of the average number of steps. Our approximation consists of replacing the probability $1-q^L(f)$ by a step function $\Theta(\tilde f - f)$, where $\tilde f$ is given by (25) for exponentially and (29) for uniformly distributed fitnesses. For $f \ll \tilde f$ and $f \gg \tilde f$, our approximate solution matches the simulation results well for any $J$. With increasing $J$, the distribution $P_J(f)$ shifts towards higher fitnesses and peaks about $\tilde f$ for larger $J$. As explained earlier, the fitness $\tilde f$ is reached when $J$ is close to $\bar J \propto \ln L$, and therefore we expect our approximation to work well for $J \ll \ln L$.

When the underlying fitness distribution is exponential, we find that the fitness distribution of the fixed beneficial mutation also has an exponential tail (see (45)). The robustness of this result, i.e. whether any fitness distribution in the Gumbel class exhibits an exponential tail for $P_J(f)$, is however not clear. For uniformly distributed fitnesses, as the width of the distribution $1-q^L(f)$ decreases with increasing $L$, the step distribution approximation works better in this case than in the exponential case where the width is a constant (compare Figs. 4 and 6). The properties of multiple steps in an adaptive walk have been measured in some recent experiments (Rokyta et al., 2009; Schoustra et al., 2009), and a detailed analysis of the experimental results would be very welcome. On the theoretical front, an extension of the results described above to distributions other than uniform and exponential would be desirable. We have recently made some progress in this direction and the results will appear elsewhere.

Another interesting question concerns the distribution $P(s_J)$ of the selection coefficient $s_J = (f_J - f_{J-1})/f_{J-1}$ at the $J$th step in the adaptive walk. As we start with zero fitness, the selection coefficient is defined for $J \ge 2$. Our preliminary numerical results for $P(s_J)$ are shown in Fig. 8 for the first few steps in the walk, and we observe that the typical selection coefficient decreases as the walk proceeds. This behavior matches qualitatively with the experimental results of Schoustra et al. (2009). A theoretical analysis of the distribution $P(s_J)$ requires the joint distribution of the fitness at steps $J-1$ and $J$, and we hope to address this question in a future work.
APPENDIX A: RANDOM ADAPTIVE WALK

In this Appendix, we briefly review the known results for the random adaptive walk, in which all better mutants are chosen with equal probability (Macken and Perelson, 1989; Flyvbjerg and Lautrup, 1992; Kauffman, 1993). The probability distribution $P_J(f)$ obeys the following recursion relation (Flyvbjerg and Lautrup, 1992):

$$ P_{J+1}(f) = \int_0^f dh\, \frac{p(f)}{1-q(h)}\,[1-q^L(h)]\,P_J(h) \tag{61} $$

where $q(f) = \int_0^f dg\, p(g)$. A change of variable from the fitness $f$ to the cumulative probability $q(f)$ gives

$$ P_{J+1}(q) = \int_0^q dq'\, \frac{1-q'^L}{1-q'}\,P_J(q') \tag{62} $$



Since the walk length distribution for the random adaptive walk also obeys (17), we have

$$ Q_J = \int dh\, q^L(h)\,P_J(h) = \int_0^1 dq\, q^L P_J(q) \tag{63} $$

which shows that $Q_J$ is a universal distribution in that it is independent of the underlying fitness distribution $p(f)$. Note that for infinitely long sequences, the probability $Q_J = 0$, as in the mutational landscape model. Differentiating (62) with respect to $q$ immediately gives

$$ \frac{dP_{J+1}(q)}{dq} = \frac{1-q^L}{1-q}\,P_J(q) = \sum_{n=0}^{L-1} q^n P_J(q) \tag{64} $$



The generating function $G(x,q) = \sum_{J=1}^{\infty} x^J P_J(q)$ then obeys the following first order differential equation:

$$ G'(x,q) - x P_1'(q) = x\,\frac{1-q^L}{1-q}\,G(x,q) \tag{65} $$

For the initial condition $P_0(f) = \delta(f)$, we have $P_1(q) = 1$ and, due to (62), the distribution $P_J(0) = 0$ for $J \ge 2$. Solving the above differential equation using these boundary conditions gives $G(x,q) = x\,e^{x H_L(q)}$, where $H_L(q) = \sum_{k=1}^{L} q^k/k$, and hence the distribution $P_J(q)$ is given by (Flyvbjerg and Lautrup, 1992)

$$ P_J(q) = \frac{H_L^{J-1}(q)}{(J-1)!} \tag{66} $$
Since the product $q^L P_J(q)$ in (63) peaks around $q = 1$, using $H_L(q) \approx \ln L$ for $q$ close to unity for finite but long sequences and performing the integral in (63), we get

$$ Q_J \approx e^{-\bar J}\,\frac{\bar J^{\,J-1}}{(J-1)!} \tag{67} $$

where $\bar J = \ln L$. Thus the walk length distribution is a Poisson distribution (in $J-1$) with mean $\bar J = \ln L$ (Flyvbjerg and Lautrup, 1992).
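The exact expression obtained by inserting (66) in (63) is easy to evaluate numerically, which gives a feel for how good the Poisson form is at finite $L$. The sketch below is illustrative only: it uses a plain trapezoidal rule on a uniform grid in $q$, and the grid size, the cutoff `j_max` and the function name are our choices.

```python
import math
import numpy as np

def random_walk_length_distribution(L, j_max=40, n_grid=100_001):
    """Q_J = int_0^1 dq q^L H_L(q)^(J-1) / (J-1)!  for J = 1..j_max."""
    q = np.linspace(0.0, 1.0, n_grid)
    H = np.zeros_like(q)
    power = np.ones_like(q)
    for k in range(1, L + 1):
        power = power * q      # q**k
        H += power / k         # partial sum of H_L(q)
    weight = power             # q**L after the loop
    dq = q[1] - q[0]
    Q = []
    for J in range(1, j_max + 1):
        integrand = weight * H ** (J - 1) / math.gamma(J)  # gamma(J) = (J-1)!
        # trapezoidal rule on the uniform grid
        Q.append(dq * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1])))
    return np.array(Q)

L_example = 200
Q = random_walk_length_distribution(L_example)
steps = np.arange(1, len(Q) + 1)
total = float(Q.sum())
mean = float(np.dot(steps, Q))
```

The distribution is normalised, and its mean lies close to $\ln L + 1$, the mean of a Poisson variable in $J-1$ with parameter $\ln L$; the residual difference is an $L$-independent constant of order unity.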



APPENDIX B: SIMULATION PROCEDURE 

For short sequences of length $L < 10$ and uncorrelated fitnesses, a randomly chosen sequence was assigned a fitness equal to zero. Then the rest of the fitness landscape, comprising $2^L - 1$ fitnesses, was generated by drawing random variables independently from a common distribution $p(f)$. The transition probability from the initial sequence to each of the better sequences among the $L$ nearest neighbors was calculated according to (4), and the fixed sequence at the first step in the adaptive walk was chosen. Then the transition probability from the chosen mutant sequence to its better neighbors was calculated, and this process was repeated until no fitter sequence was available.

To simulate sequences with length $L > 10^2$, we followed an approximate procedure outlined in Orr (2002), as the total number of sequences $2^L$ is prohibitively large for long sequences. Starting with zero fitness, $L$ i.i.d. random variables were generated and a higher fitness $f$ was chosen according to the transition probability (4). During the next step in the process, $L$ new i.i.d. random variables were generated and the transition probability from $f$ to a better fitness was calculated. These steps were repeated until none of the new set of random fitnesses exceeded the currently fixed fitness. The block model was simulated to generate weakly correlated fitnesses by assigning independent fitnesses to each block sequence. In all the simulations, the data was collected using $10^6$ independent realisations of the fitness landscape.
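The approximate procedure for long sequences can be sketched as follows. This is a minimal illustration and not the code used to produce the results reported here: the function names and the number of walks are ours, and the transition probability is assumed to be proportional to the fitness gain of each better neighbour, normalised over all better neighbours, which is our reading of (4).

```python
import numpy as np

def adaptive_walk_length(L, draw_fitness, rng):
    """Number of steps of one adaptive walk starting from zero fitness."""
    current = 0.0
    steps = 0
    while True:
        # Fresh set of L i.i.d. neighbour fitnesses at every step.
        neighbours = draw_fitness(rng, L)
        better = neighbours[neighbours > current]
        if better.size == 0:
            return steps  # local optimum reached
        # Assumed transition rule: probability proportional to fitness gain.
        gains = better - current
        current = rng.choice(better, p=gains / gains.sum())
        steps += 1

def mean_walk_length(L, draw_fitness, n_walks, seed=0):
    """Average walk length over n_walks independent realisations."""
    rng = np.random.default_rng(seed)
    lengths = [adaptive_walk_length(L, draw_fitness, rng) for _ in range(n_walks)]
    return float(np.mean(lengths))

def uniform_fitness(rng, size):
    """Uniformly distributed fitnesses on [0, 1)."""
    return rng.random(size)

def exponential_fitness(rng, size):
    """Exponentially distributed fitnesses with unit mean."""
    return rng.exponential(1.0, size)

mean_uniform = mean_walk_length(1000, uniform_fitness, n_walks=2000, seed=1)
```

With far fewer realisations than the $10^6$ used in the text the estimate is noisy, but for uniformly distributed fitnesses and $L = 1000$ it should lie near $(2/3)\ln L$ plus a constant of order unity.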

APPENDIX C: DERIVATIONS FOR UNIFORMLY DISTRIBUTED FITNESS 

Solution of differential equation (49): The generating function $G_<(x,f)$ obeys the following inhomogeneous second order differential equation:

$$ G''(x,f) - \frac{2x}{(1-f)^2}\,G(x,f) = \frac{4f}{(1-f)^2} \tag{68} $$

where we have dropped the subscript for brevity. The general solution of such differential equations is the sum of the general solution $G_H(x,f)$ of the homogeneous equation, obtained by setting the right hand side equal to zero, and the particular solution $G_P$ of the inhomogeneous equation (Mathews and Walker, 1970). The homogeneous solution is of the form

$$ G_H(x,f) = a_+(1-f)^{\alpha_+} + a_-(1-f)^{\alpha_-} \tag{69} $$

where $\alpha_\pm$ are the solutions of the quadratic equation $\alpha^2 - \alpha - 2x = 0$ and given by (51). The particular solution is found using the method of variation of parameters and is of the form $G_P(x,f) = u_+(f)(1-f)^{\alpha_+} + u_-(f)(1-f)^{\alpha_-}$, where the functions $u_\pm(f)$ obey the following first order differential equations (Mathews and Walker, 1970):

$$ u_+'(f)(1-f)^{\alpha_+} + u_-'(f)(1-f)^{\alpha_-} = 0 \tag{70} $$

$$ \alpha_+ u_+'(f)(1-f)^{\alpha_+ - 1} + \alpha_- u_-'(f)(1-f)^{\alpha_- - 1} = -\frac{4f}{(1-f)^2} \tag{71} $$

On solving the above equations, we obtain

$$ G_P(x,f) = \frac{4}{\alpha_+\alpha_-} - \frac{4(1-f)}{(1-\alpha_+)(1-\alpha_-)} = -\frac{2f}{x} \tag{72} $$

Finally, using the boundary conditions in the general solution $G_<(x,f) = G_P(x,f) + G_H(x,f)$, the desired result is obtained.

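Distribution of fixed beneficial mutations: The fitness distribution found using (52) and (53) is given below for the first few adaptive steps:

$$ P_1(f) = 2f, \qquad f < 1 \tag{73} $$

$$ P_2(f) = \begin{cases} -8f + 4(f-2)\ln(1-f), & f < \tilde f \\[4pt] \dfrac{4\tilde f\,(f+\tilde f-2)}{1-\tilde f} + 4(f-2)\ln(1-\tilde f), & f > \tilde f \end{cases} \tag{74} $$

$$ P_3(f) = \begin{cases} 4\left[12f + \ln(1-f)\left(12 - 6f + f\ln(1-f)\right)\right], & f < \tilde f \\[4pt] \dfrac{4}{1-\tilde f}\Big[6\tilde f(2-f-\tilde f) + 2\big(6-(6-\tilde f)\tilde f - f(3-2\tilde f)\big)\ln(1-\tilde f) \\ \qquad\quad +\, f(1-\tilde f)\ln^2(1-\tilde f)\Big], & f > \tilde f \end{cases} \tag{75} $$

$$ P_4(f) = \begin{cases} -\dfrac{8}{3}\left[120f + 60(2-f)\ln(1-f) + 12f\ln^2(1-f) + (2-f)\ln^3(1-f)\right], & f < \tilde f \\[4pt] -\dfrac{8}{3(1-\tilde f)}\Big[60\tilde f(2-f-\tilde f) - 12\big(f(5-3\tilde f) - 2(5-(5-\tilde f)\tilde f)\big)\ln(1-\tilde f) \\ \qquad\quad +\, 3\big(f(2-3\tilde f) + (2-\tilde f)\tilde f\big)\ln^2(1-\tilde f) + (2-f)(1-\tilde f)\ln^3(1-\tilde f)\Big], & f > \tilde f \end{cases} \tag{76} $$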
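The closed forms for the first few steps can be cross-checked against a direct numerical iteration of the recursion. The sketch below is illustrative and covers only the region below the threshold $\tilde f$, where the step function equals one. It assumes that, for uniformly distributed fitnesses, the recursion kernel is $2(f-h)/(1-h)^2$, which reproduces $P_1(f) = 2f$; the grid and the function name are our choices.

```python
import numpy as np

def next_distribution(f, P):
    """Apply P_{J+1}(f) = int_0^f dh 2 (f - h) / (1 - h)^2 P_J(h) on a grid.

    Valid below the threshold f_tilde, where the step function is unity.
    Uses the trapezoidal rule on the grid points up to each f.
    """
    out = np.zeros_like(P)
    for i in range(1, len(f)):
        h = f[: i + 1]
        integrand = 2.0 * (f[i] - h) / (1.0 - h) ** 2 * P[: i + 1]
        out[i] = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(h))
    return out

f = np.linspace(0.0, 0.8, 1601)
P1 = 2.0 * f
P2_numeric = next_distribution(f, P1)
P3_numeric = next_distribution(f, P2_numeric)

log_term = np.log1p(-f)  # ln(1 - f)
P2_closed = -8.0 * f + 4.0 * (f - 2.0) * log_term
P3_closed = 4.0 * (12.0 * f + log_term * (12.0 - 6.0 * f + f * log_term))

max_error_P2 = float(np.max(np.abs(P2_numeric - P2_closed)))
max_error_P3 = float(np.max(np.abs(P3_numeric - P3_closed)))
```

Both errors are set by the quadrature step and shrink as the grid is refined.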
Walk length distribution: On matching powers of $x^J$ on both sides in (55), we get

$$ Q_1 = e^{-2\ell}\left(-1 + 2e^{\ell}\right) \tag{77} $$

$$ Q_2 = 2e^{-2\ell}\left[3 + \ell + (-3 + 2\ell)\,e^{\ell}\right] \tag{78} $$

$$ Q_3 = e^{-2\ell}\left[-2(18 + 8\ell + \ell^2) + 4e^{\ell}(9 - 5\ell + \ell^2)\right] \tag{79} $$

$$ Q_4 = \frac{4}{3}\,e^{-2\ell}\left[180 + 84\ell + 15\ell^2 + \ell^3 + e^{\ell}(-180 + 96\ell - 21\ell^2 + 2\ell^3)\right] \tag{80} $$

where $\ell = \ln L$. A general solution for $Q_J$ by this method does not seem possible, but an approximate analytic expression for $Q_J$ can be obtained as explained below.

From the definition of the generating function $H(x)$ in (55), it follows that

$$ Q_J = \frac{1}{J!}\left.\frac{d^J H(x)}{dx^J}\right|_{x=0} \tag{81} $$

By the residue theorem for complex variables, we have (Mathews and Walker, 1970)

$$ \frac{1}{2\pi i}\oint_C dz\, f(z) = \frac{1}{n!}\left.\frac{d^n}{dz^n}\left[(z-z_0)^{n+1} f(z)\right]\right|_{z=z_0} \tag{82} $$

where $z_0$ is a pole of order $n+1$ of the function $f(z)$ and the contour $C$ encloses the singularities of $f(z)$. From (81) and (82), we can write

$$ Q_J = \frac{1}{2\pi i}\oint_C dz\, \frac{H(z)}{z^{J+1}} = \frac{1}{2\pi i}\oint_C dz\, e^{K(z)} \tag{83} $$

where $K(z) = \ln H(z) - (J+1)\ln z$. We solve this integral by the method of steepest descent, which for large $J$ gives (Mathews and Walker, 1970)

$$ Q_J \approx \frac{e^{K(z_s)}}{\sqrt{2\pi K''(z_s)}} = \frac{1}{\sqrt{2\pi K''(z_s)}}\,\frac{H(z_s)}{z_s^{J+1}} \tag{84} $$

where prime refers to a derivative with respect to $z$. In the above equation, $z_s$ is a solution of the equation

$$ \frac{H'(z_s)}{H(z_s)} = \frac{J+1}{z_s} \tag{85} $$
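and

$$ K''(z_s) = \frac{H''(z_s)}{H(z_s)} - \left[\frac{H'(z_s)}{H(z_s)}\right]^2 + \frac{J+1}{z_s^2} \tag{86} $$

$$ \phantom{K''(z_s)} = \frac{H''(z_s)}{H(z_s)} - \left[\frac{H'(z_s)}{H(z_s)}\right]^2 + \frac{1}{z_s}\,\frac{H'(z_s)}{H(z_s)} \tag{87} $$

where prime denotes a derivative with respect to $z$. Since $\alpha_+ > 0$, neglecting the exponentially small term in $(1-\tilde f)^{\alpha_+}$ in (55), we get

$$ H(z) \approx \frac{e^{-3\ell/2}\,e^{\ell y/2}\,(3+y)(y^2-1)}{16\,y} \tag{88} $$

where $y = \sqrt{1+8z}$. Differentiating $H(z)$ once with respect to $z$ gives

$$ \frac{H'(z)}{H(z)} = \frac{8(y+3) + 4(2y+3)(y^2-1) + 2y(y+3)(y^2-1)\,\ell}{y^2(y^2-1)(y+3)} \tag{89} $$

Using the above expression in (85) for large $y$, we get $y_s \approx 4J/\ell$ and therefore

$$ z_s \approx \frac{2J^2}{\ell^2} \tag{90} $$

On differentiating (89) once, we have

$$ \frac{H''(z)}{H(z)} = \left[\frac{H'(z)}{H(z)}\right]^2 - \frac{8\ell}{y^3} + \frac{32}{y^4} - \frac{16(2y+3)}{y^3(y+3)^2} - \frac{64}{(y^2-1)^2} \tag{91} $$

Using (89) and (91) in (87), we obtain

$$ K''(z_s) = \frac{8\left[-36 + 6y_s(y_s^2-3) + y_s(y_s+3)^2(1+y_s^2)\,\ell\right]}{y_s^4\,(y_s+3)^2\,(y_s^2-1)} \tag{92} $$

$$ \phantom{K''(z_s)} \approx \frac{8\ell}{y_s^3} \approx \frac{\ell^4}{8J^3} \tag{93} $$

Thus we have

$$ Q_J \approx \frac{2J^{3/2}}{\sqrt{\pi}\,\ell^2} \times \frac{2-\alpha_-(z_s)}{\alpha_+(z_s)-\alpha_-(z_s)} \times \frac{(1-\tilde f)^{1+\alpha_-(z_s)}}{z_s^{J}} \tag{94} $$

where $\alpha_\pm$ is given by (51).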
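The contour integral representation of $Q_J$ used above can also be evaluated numerically, without the steepest descent approximation, by sampling $H(z)$ on a circle around the origin. The sketch below is illustrative only: the radius, the number of sample points and the function names are our choices, and it takes $1-\tilde f = e^{-\ell} = 1/L$, as implied by the definition $\ell = \ln L$ in this Appendix. It compares the result with the closed forms for $Q_1$ and $Q_2$.

```python
import cmath
import math

def H(z, u):
    """Generating function H(z) for uniform fitnesses; u = 1 - f_tilde."""
    s = cmath.sqrt(1.0 + 8.0 * z)
    ap, am = (1.0 + s) / 2.0, (1.0 - s) / 2.0
    return z * u / (am - ap) * ((2.0 - ap) * u**ap - (2.0 - am) * u**am)

def Q_contour(J, u, radius=0.5, n_points=256):
    """Q_J = (1 / 2 pi i) * contour integral of H(z) / z^(J+1).

    With z = radius * exp(i theta), dz = i z dtheta, so the integral is
    the average of H(z) / z^J over equally spaced points on the circle.
    """
    total = 0.0 + 0.0j
    for k in range(n_points):
        z = radius * cmath.exp(2j * math.pi * k / n_points)
        total += H(z, u) / z**J
    return (total / n_points).real

L_example = 100
u = 1.0 / L_example
ell = math.log(L_example)

Q1_closed = math.exp(-2 * ell) * (-1.0 + 2.0 * math.exp(ell))
Q2_closed = 2.0 * math.exp(-2 * ell) * (3.0 + ell + (-3.0 + 2.0 * ell) * math.exp(ell))
Q1_numeric = Q_contour(1, u)
Q2_numeric = Q_contour(2, u)
```

The same routine gives $Q_J$ for larger $J$, where the closed forms become unwieldy, and can be used to gauge the accuracy of the saddle point estimate.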
Figure 1: Evolution of average fitness with the number of adaptive steps starting from zero initial fitness obtained numerically (points) and compared with the average fitness in the infinite sequence length limit (lines) for (a) power law distributed fitness with $\delta = 6$, equation (22), (b) exponentially distributed fitness, equation (23), and (c) uniformly distributed fitness, equation (25).

Figure 2: Average number $\bar J$ of adaptive steps as a function of sequence length $L$ for various fitness distributions when the fitnesses are uncorrelated. The points show the data obtained using numerical simulations and the lines are the best fit to the function $\bar J = \alpha \ln L + \beta$. The results for the greedy walk and the random adaptive walk (up to an additive constant) are also shown. The numerical fit for the prefactor $\alpha$ for the exponential and the uniform fitness distribution matches well with the analytical results given by (45) and (55) respectively.

Figure 3: Main: Comparison of the distribution $P_J(f)$ for $J = 1, 2, 3, 5$ obtained numerically (points) and analytically (lines) given by (43) for exponentially distributed fitness and sequence length $L = 1000$. Inset: Numerical data for $P_J(f)$ for $J = 4, 5, 6$ to show that the fitness distribution does not shift appreciably beyond $\bar J \approx 4.6$ as the local optimum with average fitness $\approx 7$ is approached.

Figure 4: Walk length distribution $Q_J$ for $p(f) = e^{-f}$ comparing the numerical (points) and the analytical result (lines) given by (45).

Figure 5: Comparison of the distribution $P_J(f)$ for $J = 1, 2, 3, 4$ obtained numerically (points) and analytically (lines) given by (73)-(76) for uniformly distributed fitness and sequence length $L = 100$. The distribution for $f < \tilde f$ is shown in the main plot and for $f > \tilde f$ in the inset.

Figure 6: Walk length distribution $Q_J$ for uniformly distributed fitnesses comparing the simulation (points) and the analytical result (lines) in (94).

Figure 7: Average number $\bar J_B$ of adaptive steps as a function of block number $B$ for fixed $L/B = 100$. The numerical data is in excellent agreement with (59) shown by the solid line.

Figure 8: Distribution $P(s_J)$ of the selection coefficient $s_J$ for $L = 1000$ and p(f) = ■ The inset shows the decay in the average selection coefficient $\bar s_J$ as a function of $J$. The points are joined by a line to guide the eye.